Identifying Groups of Patients with Similar Physiological Characteristics and Risk Profiles

ABSTRACT

The invention relates in part to methods for partitioning a plurality of patients into risk profile groups comprising the steps of: recording a physiological signal from each patient of a plurality of patients; segmenting the physiological signal into a plurality of components for each patient of a plurality of patients; grouping the components into a plurality of information classes for each patient of a plurality of patients; assigning a representation to each information class for each patient of a plurality of patients; and grouping the patients in response to the representations of their respective information classes.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to and the benefit of U.S. Provisional Patent Application No. 61/081,445, filed Jul. 17, 2008 and U.S. Provisional Patent Application No. 61/081,437, filed Jul. 17, 2008, the entire disclosures of each of which are hereby incorporated by reference herein.

BACKGROUND

Many different techniques exist for the assessment of cardiac risk using information in the ECG signal. Measures such as heart rate variability, heart rate turbulence, t-wave alternans and morphologic variability have been shown to be good risk-stratifiers for future cardiovascular events and focus on calculating a particular feature from the raw ECG signal to rank patients. However, current techniques are typically insufficient for predicting future health and risk of mortality.

SUMMARY OF THE INVENTION

In one aspect the invention relates to an approach to estimate the difference in electrocardiographic activity recorded during the long-term monitoring of two patients. In one embodiment this metric is termed the electrocardiographic mismatch (EM) between the patients. In one embodiment EM is computed by first creating a symbolic representation of the electrocardiographic signal for each patient, and by then carrying out a weighted inter-patient comparison of the symbol distributions. The resulting electrocardiographic mismatch value serves as an indicator of how different the patients are electrocardiographically.

In one aspect the EM is used to partition a population of patients into sub-groups comprising individuals with similar cardiac characteristics and risk profiles. This is done by using clustering to group together patients with low EM values relative to each other.

When evaluated on a population of 686 patients who had suffered from non-ST elevation acute coronary syndromes, hierarchical clustering with EM was able to identify abnormal patients at increased risk of death and myocardial infarction over a 90 day follow-up period. The 20% of patients most different from the majority cluster were at 5.3 times increased risk of death (p-value: 0.003) relative to other members of the population, and had a 2.8 times increased risk for the combined endpoint of death and myocardial infarction (p-value: 0.003).

In one aspect, the invention relates to a method of partitioning a plurality of patients into risk profile groups. The method can include the steps of recording a physiological signal from each patient of a plurality of patients; dividing each physiological signal into a plurality of equivalent time portions for each patient of the plurality of patients; assigning a representation to each portion of the plurality of equivalent time portions for each respective physiological signal for each respective patient of the plurality of patients; and grouping the patients in response to the representations of their respective physiological signals. Physiological signals include, for example, physiological signals. In some embodiments of the method, the representation is a numerical value. In some embodiments of the method, the representation is a symbol. In some embodiments, the physiological signal is an ECG and the equivalent time portion is a heartbeat.

In one aspect, the invention relates to a method of partitioning a plurality of patients into risk profile groups. The method includes the steps of recording a physiological signal from each patient of a plurality of patients; segmenting the physiological signal into a plurality of components for each patient of a plurality of patients; grouping the components into a plurality of information classes for each patient of a plurality of patients; assigning a representation to each information class for each patient of a plurality of patients; and grouping the patients in response to the representations of their respective information classes. In some embodiments of the method, the representation is a numerical value. In some embodiments of the method, the representation is a symbol. In some embodiments, the physiological signal is an ECG and the equivalent time portion is a heartbeat. In some embodiments, the representation is a waveform such as, for example, a prototype (archetype) waveform or a centrotype waveform.

In some embodiments, the grouping step includes measuring an electrocardiographic mismatch between each pair of patients in the plurality of patients; assigning each patient of the plurality of patients to a respective cluster of a plurality of clusters; grouping clusters in response to the electrocardiographic mismatch until the number of clusters reaches a predefined minimum; and assigning risk outcomes to each of the clusters. In some embodiments, the grouping of clusters is performed hierarchically. In some embodiments, the grouping of clusters is performed using, for example, fuzzy clustering, max-min clustering, k-means clustering, or SVM clustering.

In one aspect, the invention provides a method of partitioning a plurality of patients into risk profile groups. The method includes the steps of recording an ECG signal from each patient of a plurality of patients; dividing each ECG signal into a plurality of heartbeats for each patient of the plurality of patients; grouping heartbeats into clusters for each patient of the plurality of patients; assigning a representation to each cluster for each patient of the plurality of patients; assigning a value to differences in the representation of clusters for each respective ECG signal for each respective patient of the plurality of patients; and grouping the patients in response to the values of their respective differences in ECG signals.

In one aspect, the invention relates to a method of partitioning a plurality of patients into risk profile groups. The method includes the steps of recording a physiological signal from each patient of a plurality of patients; dividing each physiological signal into a plurality of equivalent time portions for each patient of the plurality of patients; grouping time portions into clusters for each patient of the plurality of patients; assigning a representation to each cluster for each patient of the plurality of patients; assigning a value to differences in the representation of clusters for each respective physiological signal for each respective patient of the plurality of patients; and grouping the patients in response to the values of their respective differences in physiological signals.

In another aspect, the invention relates to a method of assigning a risk score to new patients by matching them to an existing database of patients. The method includes the steps of grouping patients in the database using the methods described herein; segmenting the physiological signal into a plurality of components for each new patient; grouping the components into a plurality of information classes for each new patient; assigning a representation to each information class for each new patient; matching the representations of new patients with representations of groups of patients from the database; and assigning new patients the risk characteristics of patients in the groups of patients in the database with matching representations.

BRIEF DESCRIPTION OF THE DRAWINGS

The aspects, embodiments, and features of the invention can be better understood with reference to the drawings described herein. The drawings are provided to highlight specific embodiments of the invention and are not intended to limit the invention, the scope of which is defined by the claims.

FIG. 1 shows Kaplan-Meier survival curves for (a) Death, (b) MI and (c) Death/MI. The survival curves of the low risk group (n=460) are shown as a dashed line and the survival curves of the high risk group (n=226) are shown as a solid line, in accordance with an illustrative embodiment of the invention.

FIG. 2 is a graph showing the event rates in the high risk population as the cutoff is varied, in accordance with an illustrative embodiment of the invention.

DESCRIPTION OF A PREFERRED EMBODIMENT

These and other aspects, embodiments, and features of the invention are further described in the following sections of the application, which are provided to highlight specific embodiments of the invention and are not intended to limit the invention. Other embodiments are possible and modifications may be made without departing from the spirit and scope of the invention. In addition, the use of sections in the application is not meant to limit the invention; each section can apply to any aspect, embodiment, or feature of the invention.

It should be understood that the order of the steps of the methods of the invention is immaterial so long as the invention remains operable. Moreover, two or more steps may be conducted simultaneously or in a different order than recited herein unless otherwise specified.

There is considerable evidence to suggest that information in the electrocardiographic (ECG) signal may have prognostic value for cardiac patients. Techniques such as heart rate variability, heart rate turbulence, t-wave alternans and morphologic variability have all been shown to be good risk-stratifiers for future cardiovascular events in the setting of a prior acute coronary syndrome. The focus of these methods is to calculate a particular feature from the raw ECG signal, and to use it to rank patients along a risk scale.

The invention discussed herein relates to a comparative approach to identify abnormal patients at increased risk of adverse cardiovascular outcomes. In contrast to deriving a feature from the ECG for each patient, we directly compare the signals for every pair of patients to determine how different they are, i.e., how much electrocardiographic mismatch exists. Patients with abnormal cardiac characteristics then correspond to those individuals whose long-term electrocardiogram did not match a dominant group in the population.

The invention relates to a means to obtain a quantifiable comparison of how different two patients are electrocardiographically. The focus of the invention is to use this information to partition patients into similar groups, with matching long-term electrocardiograms. The underlying hypothesis here is that patients with ECG signals that match in morphology and dynamics will have consistent risk profiles. This allows one to obtain a more fine-grained understanding of how a patient's health will evolve over time, and more accurately assign a risk score to the patient for events such as death and myocardial infarction.

The application is organized as follows. Section 2 describes how electrocardiographic mismatch is computed through comparisons of symbolic distributions of ECG. Section 3 details a hierarchical clustering approach that is able to partition patients into groups with different risk profiles for future cardiovascular events. Section 4 discusses evaluation and Section 5 presents the results of this study. Section 6 compares the results to related efforts. Section 7 presents a discussion and conclusions.

Section 2. Electrocardiographic Mismatch

The electrocardiographic mismatch (EM) between two patients, p and q, is calculated using a two-step process.

Symbolization

As a first step, the ECG signal for each patient is symbolized using the techniques previously described (Syed Z, Guttag J, Stultz C. Clustering and symbolic analysis of cardiovascular signals: discovery and visualization of medically relevant patterns in long-term data with limited prior knowledge. EURASIP Journal on Applied Signal Processing, 2007 b). Symbolization involves segmenting the original ECG signal into heart beats, and then separating the beats into different groups based on their morphology and assigning a representation to each beat. In one embodiment the representation is a symbol. In another embodiment the representation is a value. This method is disclosed in detail in provisional application No. 61/081,437 (attorney docket number MIT-184PR) by Syed et al., entitled Motif Discovery in Physiological Datasets: A Methodology for Inferring Predictive Elements; filed on Jul. 17, 2008; assigned to the same party of record as this application; and incorporated herein in its entirety.

The process of comparing morphology between beats is carried out using a dynamic time-warping (DTW) algorithm. (Myers C, Rabiner L. A comparative study of several dynamic time-warping algorithms for connected word recognition. The Bell System Technical Journal. (1981) 60:1389-1409.) Given two beats, x₁ and x₂, of length l₁ and l₂ respectively, DTW produces the optimal alignment of the two sequences by first constructing an l₁-by-l₂ distance matrix d. Each entry (i,j) in this matrix represents the square of the difference between samples x₁[i] and x₂[j]. A particular alignment then corresponds to a path, φ, through the distance matrix of the form:

φ(k)=(φ₁(k),φ₂(k)), 1≦k≦K   (1)

where φ₁ and φ₂ represent row and column indices into the distance matrix, and K is the alignment length.

The optimal alignment produced by DTW minimizes the overall cost:

$\begin{matrix}{{C\left( {x_{1},x_{2}} \right)} = {\min\limits_{\phi}{C_{\phi}\left( {x_{1},x_{2}} \right)}}} & (2)\end{matrix}$

where C_(φ) is the total cost of the alignment path φ and is defined as:

$\begin{matrix}{{C_{\phi}\left( {x_{1},x_{2}} \right)} = {\sum\limits_{k = 1}^{K}{d\left( {{x_{1}\left\lbrack {\phi_{1}(k)} \right\rbrack},{x_{2}\left\lbrack {\phi_{2}(k)} \right\rbrack}} \right)}}} & (3)\end{matrix}$

The search for the optimal path is carried out in an efficient manner using dynamic programming. (Cormen T, Leiserson C, Rivest R, Stein C. Introduction to Algorithms. MIT Press and McGraw-Hill. 2001; 2nd ed.) The final energy difference between the two beats x₁ and x₂ is given by the cost of their optimal alignment, and depends on both the amplitude differences between the two signals, as well as the length K of the alignment (which increases if the two beats differ in their timing characteristics).
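By way of illustration only, the dynamic-programming search of equations (1) to (3) can be sketched as follows. The function name dtw_cost, the permitted step pattern (advance one index in either beat or in both) and the boundary conditions (the path starts at the first samples and ends at the last samples) are assumptions made for this sketch; the application does not specify them.

```python
def dtw_cost(x1, x2):
    """Return (C, K): the minimal alignment cost of equation (2) and the
    length K of one optimal alignment path.

    Assumed step pattern: (i-1, j), (i, j-1) or (i-1, j-1).
    """
    l1, l2 = len(x1), len(x2)
    if l1 == 0 or l2 == 0:
        raise ValueError("both beats must contain at least one sample")

    # best[i][j] = (cost, path length) of the cheapest path ending at (i, j).
    # Ties on cost are broken in favour of the shorter path.
    best = [[None] * l2 for _ in range(l1)]
    for i in range(l1):
        for j in range(l2):
            d = (x1[i] - x2[j]) ** 2  # entry (i, j) of the distance matrix
            if i == 0 and j == 0:
                best[i][j] = (d, 1)
                continue
            candidates = []
            if i > 0:
                candidates.append(best[i - 1][j])
            if j > 0:
                candidates.append(best[i][j - 1])
            if i > 0 and j > 0:
                candidates.append(best[i - 1][j - 1])
            prev_cost, prev_len = min(candidates)
            best[i][j] = (prev_cost + d, prev_len + 1)
    return best[l1 - 1][l2 - 1]


# Two beats with the same shape but different timing align at zero cost,
# while the alignment length K grows to the length of the longer beat.
cost, length = dtw_cost([0, 1, 0], [0, 1, 1, 0])
```

In this sketch the running time is proportional to l₁·l₂, which is the efficiency gained by dynamic programming over enumerating every path.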

In this way, the DTW approach described here measures changes in morphology resulting from amplitude and timing differences between the two beats. Using this information, beats with distinct morphologies are placed in different groups and each group is assigned a unique label or symbol. Additional description of the symbolization process is provided in Syed Z, Guttag J, Stultz C. Clustering and symbolic analysis of cardiovascular signals: discovery and visualization of medically relevant patterns in long-term data with limited prior knowledge. EURASIP Journal on Applied Signal Processing, 2007 b. The final result of this step is that the original electrocardiogram is transformed from raw samples to a sequence of symbols.

Comparing Symbol Distributions

Denoting the set of symbols for patient p as S_(p) and the set of probabilities with which these symbols occur in the electrocardiogram as P_(p) (for patient q an analogous representation is adopted), we calculate the EM between these patients as:

$\begin{matrix}{{EM}_{p,q} = {\sum\limits_{a \in S_{p}}{\sum\limits_{b \in S_{q}}{\mathrm{DTW}\left( {a,b} \right){P_{p}\lbrack a\rbrack}{P_{q}\lbrack b\rbrack}}}}} & (4)\end{matrix}$

In (4), DTW(a,b) corresponds to the dynamic time-warping cost of aligning symbols a and b.

Intuitively, the electrocardiographic mismatch between patients p and q corresponds to an estimate of the expected dynamic time-warping cost of aligning any two randomly chosen beats from these patients. The EM calculation in (4) achieves this by weighting the cost between every pair of symbols between the patients by the probabilities with which these symbols occur.
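As a minimal sketch of equation (4), the following example computes EM for two hypothetical patients. It assumes that each symbol is stored as a representative waveform (a list of samples) paired with its probability of occurrence; the function names and the sample values are illustrative and are not taken from the study data. The compact DTW helper assumes the same step pattern as a standard dynamic-programming formulation.

```python
def dtw_cost(x1, x2):
    """Minimal summed squared difference over all alignments of x1 and x2."""
    inf = float("inf")
    prev = [inf] * (len(x2) + 1)
    prev[0] = 0.0  # cost of the empty alignment
    for a in x1:
        curr = [inf] * (len(x2) + 1)
        for j, b in enumerate(x2, start=1):
            curr[j] = (a - b) ** 2 + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[-1]


def electrocardiographic_mismatch(patient_p, patient_q):
    """Equation (4): probability-weighted DTW cost over all symbol pairs.

    Each patient is a list of (representative_waveform, probability) pairs.
    """
    total = 0.0
    for wave_a, prob_a in patient_p:
        for wave_b, prob_b in patient_q:
            total += dtw_cost(wave_a, wave_b) * prob_a * prob_b
    return total


# Hypothetical patients: p has two symbols, q has one.
patient_p = [([0, 1, 0], 0.9), ([0, 3, 0], 0.1)]
patient_q = [([0, 1, 0], 1.0)]
em_pq = electrocardiographic_mismatch(patient_p, patient_q)
```

Here the dominant symbol of p matches the only symbol of q at zero cost, so the mismatch comes entirely from the rarer symbol, weighted by its probability of 0.1.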

Section 3. Hierarchical Clustering of Patients Using EM

For every pair of patients in a population, the electrocardiographic mismatch between them is computed using the techniques described in Section 2. The resulting divergence matrix, D, relating the pairwise electrocardiographic mismatches between all the patients is used to partition the population into groups with resembling cardiac characteristics. In one embodiment this process is carried out by means of hierarchical clustering. (Duda R, Hart P. Pattern Classification. Wiley-Interscience. 2000; 2nd ed.)

Hierarchical clustering starts out by assigning each patient to a cluster of its own. It then proceeds to combine two clusters at every iteration and terminates when all the patients in the population have been amalgamated within a single cluster. A number of different criteria can be used to determine which two clusters should be combined at each step. In our work, we choose the unweighted average linkage (UPGMA) criterion, which corresponds to merging the two clusters A and B for which the mean electrocardiographic mismatch between the elements of the clusters is minimized, i.e., we merge clusters A and B such that they minimize:

$\begin{matrix}{\frac{1}{{\left| A \right|} \cdot {\left| B \right|}}{\sum\limits_{x \in A}{\sum\limits_{y \in B}{EM}_{x,y}}}} & (5)\end{matrix}$

To obtain a clustering at a specific precision, the hierarchical clustering process can be terminated when the original patient population has been reduced to a given number of clusters. Other clustering methods known to the prior art are also contemplated for this purpose. For example, the grouping of clusters can be performed using fuzzy clustering, max-min clustering, k-means clustering, or SVM clustering.
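The agglomerative procedure with the average linkage criterion of equation (5) can be sketched as follows. This is a deliberately simple illustration, not an optimized implementation: the function name upgma_clusters, the nested-list form of the divergence matrix and the example values are assumptions made for this sketch.

```python
def upgma_clusters(em, n_clusters):
    """Merge patients until n_clusters remain, using average linkage.

    em is a symmetric matrix (list of lists) of pairwise EM values.
    Returns a list of clusters, each a list of patient indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")

    def mean_em(a, b):
        # Equation (5): mean pairwise EM between clusters a and b.
        return sum(em[x][y] for x in a for y in b) / (len(a) * len(b))

    clusters = [[i] for i in range(len(em))]  # one cluster per patient
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest mean mismatch.
        _, i, j = min(
            (mean_em(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical divergence matrix: patients 0 and 1 resemble each other,
# as do patients 2 and 3, and the two pairs are mutually dissimilar.
em = [
    [0, 1, 10, 10],
    [1, 0, 10, 10],
    [10, 10, 0, 1],
    [10, 10, 1, 0],
]
groups = upgma_clusters(em, n_clusters=2)
```

Recomputing every pairwise mean at each iteration keeps the sketch short; a practical implementation would update the linkage values incrementally.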

Section 4. Evaluation

The population used for this work comprised patients in the TIMI DISPERSE2 trial (Cannon C, Husted S, Harrington R, Scirica B, Emanuelsson H, Peters G, Storey R. Safety, tolerability, and initial efficacy of AZD6140, the first reversible oral adenosine diphosphate receptor antagonist, compared with clopidogrel, in patients with non-ST-segment elevation acute coronary syndrome: primary results of the DISPERSE-2 trial. J Am Coll Cardiol. 2007;50:1844-51), who were admitted to a hospital with non-ST-elevation (NSTE) acute coronary syndromes. Three lead continuous ECG (cECG) monitoring (LifeCard CF/Pathfinder, DelMar Reynolds/Spacelabs, Issaquah, Wash.) was performed for a median duration of 4 days at a sampling rate of 128 Hz. The endpoints of death and myocardial infarction were adjudicated by a blinded Clinical Events Committee for a median follow-up period of 60 days. The maximum follow-up was 90 days. Data from 686 patients was available after removal of noise-corrupted signals.

To evaluate the ability of electrocardiographic mismatch to identify patients at increased risk of future cardiovascular events, we first separated out the patients into a dominant normal sub-population and a group of abnormal patients. This was done by terminating hierarchical clustering one iteration before it placed more than 80% of the patients in the same cluster. This dichotomized the patients into a low risk group containing less than 80% of the population, and a high risk group containing the rest.
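One possible reading of this stopping rule is sketched below: average-linkage merging proceeds until the next merge would create a cluster holding more than the cutoff fraction of the population, at which point the largest cluster is taken as the low risk group. The function name dichotomize, the parameter max_fraction and the example matrix are hypothetical, and the tie handling is an assumption of this sketch.

```python
def dichotomize(em, max_fraction=0.8):
    """Split patients into (low_risk, high_risk) index lists.

    Merging stops one iteration before any cluster would contain more
    than max_fraction of the patients.
    """
    n = len(em)

    def mean_em(a, b):
        # Average linkage: mean pairwise EM between clusters a and b.
        return sum(em[x][y] for x in a for y in b) / (len(a) * len(b))

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        _, i, j = min(
            (mean_em(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if len(clusters[i]) + len(clusters[j]) > max_fraction * n:
            break  # the next merge would exceed the cutoff
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    dominant = max(clusters, key=len)  # dominant, low risk cluster
    low_risk = sorted(dominant)
    high_risk = sorted(p for c in clusters if c is not dominant for p in c)
    return low_risk, high_risk


# Hypothetical population: patients 0-2 are alike, 3 and 4 are outliers.
em = [
    [0, 1, 1, 5, 9],
    [1, 0, 1, 5, 9],
    [1, 1, 0, 5, 9],
    [5, 5, 5, 0, 9],
    [9, 9, 9, 9, 0],
]
low, high = dichotomize(em, max_fraction=0.7)
```

With a cutoff of 0.7, the three similar patients form the dominant cluster and merging stops before the first outlier is absorbed.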

Kaplan-Meier survival analysis (Machin D, Cheung Y, Parmar M. Survival Analysis: A Practical Approach. Wiley) was used to study the event rates for death and myocardial infarction (MI). The outcomes were studied both separately, as well as after being combined to create a composite endpoint of death or MI (death/MI).

Patient physiological signals are separated into discrete components, and each variation of that component is assigned a unique representation (e.g., a number or symbol). For example, ECGs from one or more patients are separated into a plurality of discrete waveforms which correspond to individual heart beats. Waveforms corresponding to normal heartbeats are each assigned a unique representation (e.g., the letter N). Waveforms corresponding to abnormal contractions originating from ventricular regions are each assigned a different representation (e.g., the letter V). Further classes of abnormal heart beats are each assigned their own unique symbols. To accommodate minor variations in individual waveforms, all N waveforms are grouped together and a characteristic N (i.e., normal) waveform is extrapolated therefrom. Characteristic waveforms are also extrapolated for each type of abnormal heartbeat. The characteristic waveforms are then used to evaluate heartbeats in the ECGs of new patients.

The characteristic waveform can be a prototype (archetype) waveform or a centrotype waveform. The difference between the prototype and the centrotype is as follows: the prototype is a waveform we construct that is the ‘average’ waveform; the centrotype is the waveform of the average element. For example, if we want the average of the numbers 1, 4, 10, the prototype approach would be to use 1+4+10 divided by 3 (i.e., we compute the average). The centrotype approach would be to say that 4 is the middle element. In some embodiments, the representation is a waveform and the probability of the information class.
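The distinction can be illustrated with a short sketch. It assumes that all waveforms in a class have the same number of samples, that the prototype is the sample-wise mean, and that the centrotype is the class member with the smallest total squared distance to the other members; the application does not fix these definitions, so they are assumptions of this example.

```python
def prototype(waveforms):
    """Constructed 'average' waveform: the sample-wise mean of the class."""
    n = len(waveforms)
    return [sum(samples) / n for samples in zip(*waveforms)]


def centrotype(waveforms):
    """Actual class member closest, in total, to all other members."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        waveforms,
        key=lambda w: sum(squared_distance(w, other) for other in waveforms),
    )


# The numbers 1, 4 and 10 from the text, treated as one-sample waveforms.
values = [[1], [4], [10]]
average_waveform = prototype(values)   # constructed value, not a member
middle_waveform = centrotype(values)   # always one of the members
```

The prototype (5.0 here) need not coincide with any recorded beat, whereas the centrotype (4 here) is always a beat that was actually observed.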

Section 5. Results

The results of univariate analysis for death, MI and the combined outcome are shown in Table 1. The corresponding Kaplan-Meier curves are presented in FIG. 1.

TABLE 1 Results of univariate analysis for the outcomes of death, MI and death/MI.

Endpoint    Hazard Ratio    95% CI        P Value
Death       5.28            1.64-17.02    0.003
MI          1.81            0.85-3.87     0.150
Death/MI    2.84            1.45-5.56     0.003

As seen in Table 1, patients who were electrocardiographically mismatched with the dominant group of the population were at increased risk of adverse cardiovascular events. Patients placed automatically in the high risk group had a much higher rate of death during follow-up than patients in the low risk group (4.42% vs. 0.87%; p=0.003). A similar trend was seen for MI (5.75% vs. 3.26%) although in this case the relationship was not statistically significant (p=0.149). For the combined death/MI endpoint, i.e., the occurrence of either of these adverse outcomes, the cumulative incidence in the high risk group was 9.29% as opposed to 3.48% in the low risk group (p=0.003).

In all, hierarchical clustering produced 31 clusters, i.e., one dominant cluster that constituted the low risk group of patients and 30 clusters that collectively formed the high risk group. Of the high risk clusters, 15 had only a single element. No death or MI events were observed in these groups, and it is likely that these isolated singletons corresponded to noisy electrocardiograms that were not removed during the noise rejection stage described in Section 4. Conversely, 5 of the high risk clusters had 10 or more elements. Table 2 presents the risk of events for these individual high risk clusters. The data suggests that patients in different clusters have distinct risk profiles. For example, in cluster B patients, the risk of death is 17.39% relative to a risk of 0.87% in the low risk population. The overall risk of death/MI is also correspondingly elevated (21.74% vs. 3.48%). Similarly, in cluster D, there are no deaths but the risk of MI is 14.29% as opposed to 3.26% in the low risk population.

TABLE 2 % of patients with events in five largest clusters in high risk group.

Cluster    # of Patients    % Death    % MI     % Death/MI
A          101              2.97       4.95     7.92
B          23               17.39      4.35     21.74
C          21               9.52       4.76     9.52
D          14               0.00       14.29    14.29
E          10               10.00      10.00    10.00

The percentage of events in the combined population comprised by the high risk clusters with 10 or more members (i.e., clusters A to E) is shown in Table 3. This data suggests that improved noise removal techniques, or disregarding small electrocardiographically mismatched clusters, could allow for a further focus on high risk cases.

TABLE 3 % of patients with events in aggregate of five largest clusters in high risk group (n = 169) compared to low risk group (n = 460).

# of Patients    % Death    % MI    % Death/MI
169              6.25       8.13    13.13
460              0.87       3.26    3.48

The results presented so far use a cutoff of 80% to separate out the low risk and high risk groups. The effect of varying this cutoff for hierarchical clustering is shown in FIG. 2. Specifically, for each decile, hierarchical clustering was terminated before it placed a corresponding percentage of patients in a single dominant cluster. The event rate between the high risk and low risk groups was then calculated. As shown in FIG. 2, increasing the clustering threshold (i.e., choosing a high risk population that is more electrocardiographically mismatched from the dominant group) generally leads to an increased percentage of events in the risk group. This effect tapers off at the final decile, most likely due to the effect described earlier, where some of the exceedingly dissimilar electrocardiograms correspond to noise.

This invention relates to an alternative approach of identifying abnormal patients by searching for population outliers. A comparative framework is adopted that is able to discover patient groups that are at an increased risk of adverse cardiovascular outcomes. In contrast to deriving a feature from the ECG for each patient, signal morphologies for every pair of patients are directly compared to determine how different they are, i.e., how much electrocardiographic mismatch exists. Patients with abnormal cardiac characteristics then correspond to those individuals whose long-term ECG did not match the dominant group in the population. A more fine-grained risk profile for patients based on the specific cluster they fall within is also developed.

The process of separating out patients into different groups essentially clusters individuals with similar ECG morphology and dynamics together. The concept of clustering ECG signals based on morphology has been proposed earlier, e.g., (Lagerholm M, Peterson C, Braccini G, Edenbrandt L, Sornmo L. Clustering ECG complexes using Hermite functions and self-organizing maps. IEEE Trans Biomed Eng. 2000;47:838-48) and (Cuesta-Frau D, Perez-Cortes J, Andreu-Garcia G. Clustering of electrocardiographic signals in computer-aided Holter analysis. Computer Methods and Programs in Biomedicine. 2003;72:179-96). The focus of these techniques has typically been to cluster individual ECG beats together based on their morphology. The current methodology is able to cluster together patients based on the morphology of the entire electrocardiogram, i.e., it performs inter-patient comparisons based on ECG morphology as opposed to inter-beat comparisons.

The present invention develops an approach to obtain a quantifiable comparison of how different two patients are electrocardiographically. This information may be used to partition patients into similar groups, with matching long-term electrocardiograms. The hypothesis underlying the work is that patients with ECG signals that match in morphology and dynamics will have consistent risk profiles. This allows one to obtain a more fine-grained understanding of how a patient's health will evolve over time, and more accurately assign a risk score to the patient for events such as death and myocardial infarction.

An experimental study shows that patients who are electrocardiographically mismatched from the majority patient population are at an increased risk of cardiovascular events such as death and MI over a 90 day follow-up period.

It is important to point out that the technique makes almost no a priori assumptions as to how these high risk patients are different from the rest of the population. In other words, no set of specific morphology classes occurring exclusively in high risk patients is assumed, nor are patients who have more or less variability in the distribution of symbols sought to be identified. One of the strengths of the method is that it is able to find a wide variety of abnormalities that would be difficult to describe along an ordinal scale. One limitation of this approach, however, is that patients whose electrocardiograms have been corrupted by noise will also appear as outliers. This does not lead to any important cases being missed but adds false positives to the high risk group.

Another significant aspect of the work is that symbolization is used as an intermediate step to calculate EM. The ECG from patients is first symbolized, and the distribution of symbols is compared as described in Section 2. The use of symbolization can be considered an optimization step. For example, EM can be calculated directly from the raw data by treating each beat as a distinct symbol. However, the use of symbolization greatly reduces the number of comparisons between the beats of two patients, allowing us to simply compare the representative elements of each symbol cluster and weight the differences by the probabilities of these symbols. For example, comparing the raw electrocardiograms between two patients may involve comparing 100,000 beats from the first patient with 100,000 beats from the second (i.e., 100,000² comparisons). However, using symbolization to reduce the data to 50 symbols for each patient would result in 50² comparisons being needed to estimate EM.
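Using the illustrative figures above, the reduction in the number of inter-patient DTW comparisons works out as follows. The count covers only the pairwise comparisons between the two patients and does not include the cost of symbolizing each patient's own recording.

$\frac{100{,}000^{2}}{50^{2}} = \frac{10^{10}}{2{,}500} = 4 \times 10^{6}$

That is, under these example figures, estimating EM from symbols requires roughly four million times fewer inter-patient comparisons than comparing every raw beat pair.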

One of the key goals of EM with hierarchical clustering is to partition patients into smaller groups with consistent risk profiles. The results in Section 5 show that different clusters may be at varying risk for subsequent cardiovascular events. This allows for a more fine-grained assessment of individual patients. Specifically, the idea of using a nearest neighbor system to assign patients to one of the previous clusters determined by EM with hierarchical clustering is currently being investigated. The approach could allow one to more precisely state what the expected risk of an individual patient is, as opposed to merely placing them within a group with elevated risk.

Importantly, the formation of groups with smaller risk profiles permits new patients to be assigned a risk score based on which patient group the new patient best matches. Thus, the invention provides methods of assigning a risk score to new patients by matching them to an existing database of patients, comprising the following steps: grouping patients in the database as described herein; segmenting the physiological signal into a plurality of components for each new patient; grouping the components into a plurality of information classes for each new patient; assigning a representation to each information class for each new patient; matching the representations of new patients with representations of groups of patients from the database; and assigning new patients the risk characteristics of patients in the groups of patients in the database with matching representations.
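A minimal sketch of the final matching and assignment steps is given below, assuming a nearest neighbor rule: the new patient inherits the cluster, and hence the risk characteristics, of the database patient with the smallest EM. The function name assign_risk and the data layout are hypothetical, and the EM values between the new patient and the database patients are assumed to have been computed beforehand. The risk figures reuse the death rates reported for cluster B and for the low risk group purely as an illustration.

```python
def assign_risk(em_to_database, cluster_of, cluster_risk):
    """Assign a new patient the cluster and risk of its nearest neighbor.

    em_to_database[i] : EM between the new patient and database patient i
    cluster_of[i]     : cluster label of database patient i
    cluster_risk      : mapping from cluster label to an event rate
    """
    if not em_to_database:
        raise ValueError("the database must contain at least one patient")
    # Index of the database patient with the smallest mismatch.
    nearest = min(range(len(em_to_database)), key=em_to_database.__getitem__)
    cluster = cluster_of[nearest]
    return cluster, cluster_risk[cluster]


# Hypothetical database of three patients in two clusters.
em_to_database = [5.0, 0.4, 3.0]
cluster_of = ["low risk", "B", "low risk"]
cluster_risk = {"low risk": 0.0087, "B": 0.1739}  # death rates as fractions
cluster, risk = assign_risk(em_to_database, cluster_of, cluster_risk)
```

Other matching rules, such as the smallest mean EM to all members of a cluster, would fit the same interface; the single nearest neighbor is used here only because it is the simplest to state.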

In this application the invention is discussed generally in terms of a method for clustering patients according to various physiological states. The invention can be implemented as a physiological (e.g., electrophysiological) monitor (e.g., ECG) in communication with, for example, a general purpose computer. The physiological signal data is received and stored in a data storage device for subsequent analysis by the program modules of the computer. Individual program modules include, but are not limited to, modules for: dividing the signal data into a plurality of time portions; assigning a representation to each time portion; and grouping the patients in response to their representations or symbols. It is contemplated that in another embodiment such program modules may in fact be incorporated into the ECG monitor itself. The data storage device can be a central database or it can be a local memory device (e.g., computer readable medium) located on, for example, a computer. The data storage device can be in bidirectional communication with the computer such that the computer can retrieve physiological data from the data storage device and the computer can save physiological data (e.g., new patient data) and analytical results to the data storage device. The computer optionally can be in communication with a display for displaying physiological data, time portions, risk profiles, risk scores, representations, symbols, numbers, waveforms, clustering and other features as described herein.

Variations, modifications, and other implementations of what is described herein will occur to those of ordinary skill in the art without departing from the spirit and scope of the invention as claimed. Accordingly, the invention is to be defined not by the preceding illustrative description, but instead by the spirit and scope of the following claims.

1. A method of partitioning a plurality of patients into risk profile groups comprising the steps of: recording a physiological signal from each patient of a plurality of patients; segmenting the physiological signal into a plurality of components for each patient of a plurality of patients; grouping the components into a plurality of information classes for each patient of a plurality of patients; assigning a representation to each information class for each patient of a plurality of patients; and grouping the patients in response to the representations of their respective information classes.
2. The method of claim 1 wherein the representation is a numerical value.
3. The method of claim 1 wherein the representation is a symbol.
4. The method of claim 1 wherein the representation is a waveform.
5. The method of claim 4 wherein the waveform is a prototype (archetype) waveform.
6. The method of claim 4 wherein the waveform is a centrotype waveform.
7. The method of claim 1 wherein the physiological signal is an ECG and the equivalent time portion is a heartbeat.
8. The method of claim 7 wherein the step of grouping comprises: measuring an electrocardiographic mismatch between each pair of patients in the plurality of patients; assigning each patient of the plurality of patients to a respective cluster of a plurality of clusters; grouping clusters in response to the electrocardiographic mismatch until the number of clusters reaches a predefined minimum; and assigning risk outcomes to each of the clusters.
9. The method of claim 8 wherein the grouping of clusters is performed hierarchically.
10. A method of partitioning a plurality of patients into risk profile groups comprising the steps of: recording an ECG signal from each patient of a plurality of patients; dividing each ECG signal into a plurality of heartbeats for each patient of the plurality of patients; grouping heartbeats into clusters for each patient of the plurality of patients; assigning a representation to each cluster for each patient of the plurality of patients; assigning a value to differences in the representation of clusters for each respective ECG signal for each respective patient of the plurality of patients; and grouping the patients in response to the values of their respective differences in ECG signals.
11. A method of partitioning a plurality of patients into risk profile groups comprising the steps of: recording a physiological signal from each patient of a plurality of patients; dividing each physiological signal into a plurality of equivalent time portions for each patient of the plurality of patients; grouping time portions into clusters for each patient of the plurality of patients; assigning a representation to each cluster for each patient of the plurality of patients; assigning a value to differences in the representation of clusters for each respective physiological signal for each respective patient of the plurality of patients; and grouping the patients in response to the values of their respective differences in physiological signals.
12. A method of assigning a risk score to new patients by matching them to an existing database of patients, comprising the following steps: grouping patients in the database using the method of claim 1; segmenting the physiological signal into a plurality of components for each new patient; grouping the components into a plurality of information classes for each new patient; assigning a representation to each information class for each new patient; matching the representations of new patients with representations of groups of patients from the database; and assigning new patients the risk characteristics of patients in the groups of patients in the database with matching representations.
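
By way of illustration only, and without limiting the claims, the hierarchical grouping of patients described above (each patient starting in a respective cluster, with clusters merged in response to pairwise mismatch until a predefined minimum number of clusters remains) might be sketched as follows. The sketch assumes a precomputed symmetric matrix of pairwise mismatch values and uses average linkage, which is one possible merging criterion and is an assumption of the example. The risk outcome assigned to each cluster is taken here to be the fraction of its patients with a recorded event, which is likewise an assumption. All names and numbers are hypothetical.

```python
from itertools import combinations

def agglomerate(mismatch, min_clusters):
    """Merge clusters of patient indices until only min_clusters remain.
    mismatch[i][j] is the pairwise mismatch between patients i and j."""
    clusters = [[i] for i in range(len(mismatch))]  # one cluster per patient

    def linkage(a, b):
        # Average linkage (an assumption): mean mismatch across the two clusters.
        return sum(mismatch[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > max(min_clusters, 1):
        # Find the pair of clusters with the smallest linkage value.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]  # merge b into a (a < b)
        del clusters[b]
    return clusters

def cluster_risk(clusters, had_event):
    """Assign each cluster the fraction of its patients with a recorded event."""
    return [sum(had_event[i] for i in c) / len(c) for c in clusters]

# Made-up mismatch values for four patients.
mismatch = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.90, 0.20, 0.00],
]
had_event = [0, 0, 1, 0]
clusters = agglomerate(mismatch, min_clusters=2)
risks = cluster_risk(clusters, had_event)
```

With these made-up values, patients 0 and 1 end in one cluster and patients 2 and 3 in the other, and each cluster carries its own event fraction as an illustrative risk outcome.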